# Global Average Pooling

Image encoder checkpoints in `timm` that replace the SigLIP attention pooling head with global average pooling (GAP):

| Model | License | Description | Tags | Library | Downloads | Likes |
|---|---|---|---|---|---|---|
| `vit_so400m_patch16_siglip_gap_384.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, with the attention pooling head removed in favor of global average pooling; suited to image feature extraction. | Image Classification, Transformers | timm | 19 | 0 |
| `vit_so400m_patch16_siglip_gap_256.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling with the attention pooling head removed; suited to image feature extraction. | Text-to-Image, Transformers | timm | 22 | 0 |
| `vit_so400m_patch14_siglip_gap_378.v2_webli` | Apache-2.0 | Vision Transformer based on SigLIP 2, pre-trained on the WebLI dataset, with the attention pooling head removed and global average pooling applied. | Image Classification, Transformers | timm | 20 | 0 |
| `vit_so400m_patch14_siglip_gap_224.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling with the attention pooling head removed; suited to image feature extraction. | Image Classification, Transformers | timm | 179 | 0 |
| `vit_large_patch16_siglip_gap_512.v2_webli` | Apache-2.0 | Vision Transformer based on the SigLIP 2 architecture, designed for image feature extraction, using global average pooling (GAP) instead of the attention pooling head. | Image Classification, Transformers | timm | 29 | 0 |
| `vit_large_patch16_siglip_gap_384.v2_webli` | Apache-2.0 | Vision Transformer based on the SigLIP 2 architecture; a GAP variant that removes the attention pooling head, suited to image feature extraction. | Text-to-Image, Transformers | timm | 95 | 0 |
| `vit_giantopt_patch16_siglip_gap_384.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling with the attention pooling head removed; suited to image feature extraction. | Image Classification, Transformers | timm | 21 | 0 |
| `vit_giantopt_patch16_siglip_gap_256.v2_webli` | Apache-2.0 | SigLIP 2 ViT image encoder with global average pooling and the attention pooling head removed; packaged for timm. | Image Classification, Transformers | timm | 17 | 0 |
| `vit_base_patch32_siglip_gap_256.v2_webli` | Apache-2.0 | Vision Transformer based on SigLIP 2, using global average pooling (GAP) instead of an attention pooling head for image encoding. | Text-to-Image, Transformers | timm | 25 | 1 |
| `vit_base_patch16_siglip_gap_512.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling with the attention pooling head removed; suited to image feature extraction. | Image Classification, Transformers | timm | 105 | 0 |
| `vit_base_patch16_siglip_gap_384.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling (GAP) instead of the attention pooling head; suited to image feature extraction. | Image Classification, Transformers | timm | 105 | 0 |
| `vit_base_patch16_siglip_gap_256.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling with the attention pooling head removed; suited to image feature extraction. | Multimodal Fusion, Transformers | timm | 114 | 1 |
| `vit_base_patch16_siglip_gap_224.v2_webli` | Apache-2.0 | Vision Transformer based on SigLIP 2, using global average pooling to produce image features. | Image Classification, Transformers | timm | 303 | 0 |
| `vit_so400m_patch16_siglip_gap_512.v2_webli` | Apache-2.0 | ViT image encoder based on SigLIP 2, using global average pooling; suited to vision-language tasks. | Text-to-Image, Transformers | timm | 21 | 0 |
| `vit_so400m_patch14_siglip_gap_896.pali_pt` | Apache-2.0 | Vision model based on the SigLIP image encoder, using global average pooling; part of the PaliGemma project. | Text-to-Image, Transformers | timm | 15 | 1 |
| `vit_so400m_patch14_siglip_gap_896.pali2_3b_pt` | Apache-2.0 | Vision model based on the SigLIP image encoder, using global average pooling; part of the PaliGemma 2 project. | Text-to-Image, Transformers | timm | 14 | 1 |
| `vit_so400m_patch14_siglip_gap_448.pali_mix` | Apache-2.0 | SigLIP image encoder from a vision-language model, using global average pooling; suited to multimodal tasks. | Text-to-Image, Transformers | timm | 15 | 0 |
| `vit_large_patch16_siglip_gap_384.webli` | Apache-2.0 | Vision Transformer based on SigLIP, using global average pooling; suited to image feature extraction. | Image Classification, Transformers | timm | 13 | 0 |
| `vit_base_patch16_siglip_gap_224.webli` | Apache-2.0 | Vision Transformer based on SigLIP, containing only the image encoder, with a global average pooling strategy. | Image Classification, Transformers | timm | 178 | 1 |